Skip to main content

PostgreSQL Comprehensive Notes

Table of Contents


What is PostgreSQL?

PostgreSQL (often called "Postgres") is a powerful, open-source object-relational database management system (ORDBMS) that has been in active development for over 30 years. It's known for its reliability, feature robustness, and performance.

Key Characteristics

  • ACID Compliant: Ensures data integrity through Atomicity, Consistency, Isolation, and Durability
  • Open Source: Free to use with a permissive license
  • Cross-Platform: Runs on Linux, Windows, macOS, and other Unix-like systems
  • Extensible: Supports custom data types, functions, and extensions
  • Standards Compliant: Follows SQL standards more closely than most databases

Why Use PostgreSQL?

1. Advanced Data Types

  • JSON and JSONB for document storage
  • Arrays, ranges, and geometric types
  • Custom data types
  • Full-text search capabilities

2. Performance & Scalability

  • Multi-version concurrency control (MVCC)
  • Advanced indexing (B-tree, Hash, GiST, SP-GiST, GIN, BRIN)
  • Table partitioning
  • Parallel queries
  • Write-ahead logging (WAL)

3. Reliability & Data Integrity

  • ACID compliance
  • Foreign key constraints
  • Check constraints
  • Transactions with savepoints
  • Point-in-time recovery

4. Extensibility

  • Custom functions in multiple languages (PL/pgSQL, Python, JavaScript, etc.)
  • Extensions ecosystem
  • Custom operators and data types
  • Procedural languages

5. Security Features

  • Row-level security
  • SSL connections
  • Authentication methods (MD5, SCRAM, Kerberos, LDAP)
  • Role-based access control

PostgreSQL vs MySQL

FeaturePostgreSQLMySQL
ACID ComplianceFull ACID complianceACID with InnoDB engine
Data TypesRich set including JSON, arrays, rangesBasic types, limited JSON support
ConcurrencyMVCC (better read performance)Locking-based
Standards ComplianceHighly SQL standard compliantLess compliant
ReplicationStreaming, logical replicationMaster-slave, master-master
Full-text SearchBuilt-in advanced searchBasic FULLTEXT indexes
ExtensibilityHighly extensibleLimited extensibility
PerformanceBetter for complex queriesBetter for simple read-heavy workloads
Learning CurveSteeperGentler
CommunityStrong academic/enterprise focusWeb development focus

When to Choose PostgreSQL over MySQL

Choose PostgreSQL when you need:

  • Complex queries and data analysis
  • Data integrity is critical
  • Advanced data types (JSON, arrays, etc.)
  • Full-text search capabilities
  • Geographic data (PostGIS extension)
  • Extensibility and customization
  • Strong consistency requirements
  • ACID compliance across all operations

Choose MySQL when you need:

  • Simple web applications
  • Read-heavy workloads
  • Faster development cycle
  • Simpler administration
  • Better performance for basic operations

PostgreSQL Features

1. Data Types

  • Numeric: INTEGER, BIGINT, DECIMAL, NUMERIC, REAL, DOUBLE PRECISION
  • Character: CHAR, VARCHAR, TEXT
  • Date/Time: DATE, TIME, TIMESTAMP, INTERVAL
  • Boolean: BOOLEAN
  • Binary: BYTEA
  • Network: INET, CIDR, MACADDR
  • Geometric: POINT, LINE, CIRCLE, POLYGON
  • JSON: JSON, JSONB
  • Arrays: Any data type can be an array
  • Ranges: INT4RANGE, TSRANGE, etc.
  • Custom Types: User-defined types

2. Advanced Features

  • Window Functions: ROW_NUMBER(), RANK(), LAG(), LEAD()
  • Common Table Expressions (CTEs): WITH queries
  • Recursive Queries: Hierarchical data processing
  • Table Inheritance: Object-oriented database features
  • Materialized Views: Pre-computed query results
  • Triggers: Row and statement-level triggers
  • Stored Procedures: Functions in multiple languages

3. Indexing

  • B-tree: Default index type
  • Hash: For equality operations
  • GiST: Generalized search trees
  • SP-GiST: Space-partitioned GiST
  • GIN: Generalized inverted indexes
  • BRIN: Block range indexes
  • Partial Indexes: Indexes on subsets of data
  • Functional Indexes: Indexes on expressions

4. Replication & High Availability

  • Streaming Replication: Real-time data streaming
  • Logical Replication: Publication/subscription model
  • Hot Standby: Read queries on standby servers
  • Point-in-Time Recovery: Restore to any point in time
  • Connection Pooling: Built-in connection management

PostgreSQL Commands by Category

Database Operations

Creating and Managing Databases

-- Create database
CREATE DATABASE mydatabase;
CREATE DATABASE mydatabase WITH ENCODING 'UTF8';

-- Drop database
DROP DATABASE mydatabase;

-- List databases
\l

-- Connect to database
\c mydatabase

-- Show current database
SELECT current_database();

Table Operations

Creating Tables

-- Basic table creation
CREATE TABLE users (
id SERIAL PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(100) UNIQUE NOT NULL,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Table with constraints
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
user_id INTEGER REFERENCES users(id),
amount DECIMAL(10,2) CHECK (amount > 0),
status VARCHAR(20) DEFAULT 'pending'
);

-- Table with arrays
CREATE TABLE tags (
id SERIAL PRIMARY KEY,
name VARCHAR(50),
categories TEXT[]
);

Altering Tables

-- Add column
ALTER TABLE users ADD COLUMN phone VARCHAR(20);

-- Drop column
ALTER TABLE users DROP COLUMN phone;

-- Rename column
ALTER TABLE users RENAME COLUMN username TO user_name;

-- Change column type
ALTER TABLE users ALTER COLUMN email TYPE TEXT;

-- Add constraint
ALTER TABLE users ADD CONSTRAINT unique_email UNIQUE (email);

-- Drop constraint
ALTER TABLE users DROP CONSTRAINT unique_email;

Table Information

-- List tables
\dt

-- Describe table structure
\d users

-- Show table with details
\d+ users

-- Show table constraints
\d+ users

Data Manipulation

INSERT Operations

-- Basic insert
INSERT INTO users (username, email) VALUES ('john_doe', 'john@example.com');

-- Multiple inserts
INSERT INTO users (username, email) VALUES
('jane_doe', 'jane@example.com'),
('bob_smith', 'bob@example.com');

-- Insert with returning
INSERT INTO users (username, email)
VALUES ('alice', 'alice@example.com')
RETURNING id, created_at;

-- Insert from another table
INSERT INTO backup_users SELECT * FROM users WHERE created_at < '2023-01-01';

SELECT Operations

-- Basic select
SELECT * FROM users;
SELECT username, email FROM users;

-- With conditions
SELECT * FROM users WHERE created_at > '2023-01-01';
SELECT * FROM users WHERE username LIKE 'john%';

-- Ordering and limiting
SELECT * FROM users ORDER BY created_at DESC LIMIT 10;
SELECT * FROM users ORDER BY username OFFSET 5 LIMIT 10;

-- Aggregations
SELECT COUNT(*) FROM users;
SELECT COUNT(*), AVG(amount) FROM orders;
SELECT status, COUNT(*) FROM orders GROUP BY status;

-- Joins
SELECT u.username, o.amount
FROM users u
JOIN orders o ON u.id = o.user_id;

-- Window functions
SELECT username,
ROW_NUMBER() OVER (ORDER BY created_at) as row_num,
RANK() OVER (ORDER BY created_at) as rank
FROM users;

UPDATE Operations

-- Basic update
UPDATE users SET email = 'newemail@example.com' WHERE id = 1;

-- Update multiple columns
UPDATE users SET
username = 'new_username',
email = 'new@example.com'
WHERE id = 1;

-- Update with returning
UPDATE users SET username = 'updated_user'
WHERE id = 1
RETURNING username, email;

-- Update from another table
UPDATE orders SET status = 'completed'
FROM users
WHERE orders.user_id = users.id AND users.username = 'john_doe';

DELETE Operations

-- Basic delete
DELETE FROM users WHERE id = 1;

-- Delete with condition
DELETE FROM users WHERE created_at < '2023-01-01';

-- Delete with returning
DELETE FROM users WHERE username = 'old_user'
RETURNING id, username;

-- Truncate table (faster for all rows)
TRUNCATE TABLE users;
TRUNCATE TABLE users RESTART IDENTITY; -- Reset sequences

Index Operations

-- Create index
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_users_username ON users(username);

-- Unique index
CREATE UNIQUE INDEX idx_users_email_unique ON users(email);

-- Partial index
CREATE INDEX idx_active_users ON users(username) WHERE active = true;

-- Functional index
CREATE INDEX idx_users_lower_email ON users(LOWER(email));

-- Composite index
CREATE INDEX idx_orders_user_status ON orders(user_id, status);

-- List indexes
\di

-- Drop index
DROP INDEX idx_users_email;

-- Reindex
REINDEX INDEX idx_users_email;
REINDEX TABLE users;

View Operations

-- Create view
CREATE VIEW active_users AS
SELECT * FROM users WHERE active = true;

-- Materialized view
CREATE MATERIALIZED VIEW user_stats AS
SELECT
DATE_TRUNC('month', created_at) as month,
COUNT(*) as user_count
FROM users
GROUP BY DATE_TRUNC('month', created_at);

-- Refresh materialized view
REFRESH MATERIALIZED VIEW user_stats;

-- Drop view
DROP VIEW active_users;
DROP MATERIALIZED VIEW user_stats;

-- List views
\dv

Function and Procedure Operations

-- Create function
CREATE OR REPLACE FUNCTION get_user_count()
RETURNS INTEGER AS $$
BEGIN
RETURN (SELECT COUNT(*) FROM users);
END;
$$ LANGUAGE plpgsql;

-- Function with parameters
CREATE OR REPLACE FUNCTION get_user_by_id(user_id INTEGER)
RETURNS TABLE(username VARCHAR, email VARCHAR) AS $$
BEGIN
RETURN QUERY
SELECT u.username, u.email
FROM users u
WHERE u.id = user_id;
END;
$$ LANGUAGE plpgsql;

-- Call function
SELECT get_user_count();
SELECT * FROM get_user_by_id(1);

-- Drop function
DROP FUNCTION get_user_count();

-- List functions
\df

User and Permission Operations

-- Create user
CREATE USER myuser WITH PASSWORD 'mypassword';

-- Create role
CREATE ROLE myrole;

-- Grant permissions
GRANT SELECT, INSERT ON users TO myuser;
GRANT ALL PRIVILEGES ON DATABASE mydatabase TO myuser;

-- Revoke permissions
REVOKE INSERT ON users FROM myuser;

-- List users
\du

-- List permissions
\dp users

Transaction Operations

-- Begin transaction
BEGIN;

-- Commit transaction
COMMIT;

-- Rollback transaction
ROLLBACK;

-- Savepoints
BEGIN;
SAVEPOINT sp1;
-- Some operations
ROLLBACK TO sp1;
COMMIT;

-- Transaction isolation levels
BEGIN ISOLATION LEVEL READ COMMITTED;
BEGIN ISOLATION LEVEL REPEATABLE READ;
BEGIN ISOLATION LEVEL SERIALIZABLE;

Advanced Query Operations

JSON Operations

-- JSON data type
CREATE TABLE documents (
id SERIAL PRIMARY KEY,
data JSON
);

-- Insert JSON data
INSERT INTO documents (data) VALUES
('{"name": "John", "age": 30, "city": "New York"}');

-- Query JSON
SELECT data->>'name' as name FROM documents;
SELECT * FROM documents WHERE data->>'age' = '30';

-- JSONB operations
CREATE TABLE documents_b (
id SERIAL PRIMARY KEY,
data JSONB
);

-- JSONB indexing
CREATE INDEX idx_documents_data ON documents_b USING GIN (data);

Array Operations

-- Create table with arrays
CREATE TABLE posts (
id SERIAL PRIMARY KEY,
title VARCHAR(100),
tags TEXT[]
);

-- Insert array data
INSERT INTO posts (title, tags) VALUES
('First Post', ARRAY['tech', 'programming', 'sql']);

-- Query arrays
SELECT * FROM posts WHERE 'tech' = ANY(tags);
SELECT * FROM posts WHERE tags @> ARRAY['tech'];
SELECT unnest(tags) as tag FROM posts;
-- Basic full-text search
SELECT * FROM documents WHERE to_tsvector(content) @@ to_tsquery('postgresql');

-- Create text search index
CREATE INDEX idx_documents_fts ON documents USING GIN (to_tsvector('english', content));

-- Ranking results
SELECT title, ts_rank(to_tsvector(content), to_tsquery('postgresql')) as rank
FROM documents
WHERE to_tsvector(content) @@ to_tsquery('postgresql')
ORDER BY rank DESC;

System and Maintenance Operations

Database Information

-- Show version
SELECT version();

-- Show current user
SELECT current_user;

-- Show database size
SELECT pg_size_pretty(pg_database_size('mydatabase'));

-- Show table sizes
SELECT schemaname, tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

-- Show running queries
SELECT pid, now() - pg_stat_activity.query_start AS duration, query
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';

Backup and Restore

# Dump database
pg_dump mydatabase > mydatabase_backup.sql

# Dump with compression
pg_dump -Fc mydatabase > mydatabase_backup.dump

# Restore database
psql mydatabase < mydatabase_backup.sql

# Restore from custom format
pg_restore -d mydatabase mydatabase_backup.dump

# Dump specific table
pg_dump -t users mydatabase > users_backup.sql

Performance and Statistics

-- Update table statistics
ANALYZE users;
ANALYZE; -- All tables

-- Vacuum table
VACUUM users;
VACUUM FULL users; -- More thorough but locks table

-- Auto vacuum settings
SELECT name, setting FROM pg_settings WHERE name LIKE 'autovacuum%';

-- Index usage statistics
SELECT schemaname, tablename, indexname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes;

-- Table statistics
SELECT schemaname, tablename, n_tup_ins, n_tup_upd, n_tup_del
FROM pg_stat_user_tables;

Common psql Commands

# Connection
psql -h hostname -U username -d database

# psql meta-commands
\? # Help on psql commands
\h # Help on SQL commands
\h SELECT # Help on specific SQL command

# Database and table info
\l # List databases
\c database # Connect to database
\dt # List tables
\di # List indexes
\dv # List views
\df # List functions
\du # List users
\dp table # List table permissions

# Table structure
\d table # Describe table
\d+ table # Describe table with details

# Settings
\set # Show all variables
\timing on # Show query execution time
\x # Toggle expanded display

# File operations
\i filename # Execute commands from file
\o filename # Send output to file
\o # Send output back to stdout

# Quit
\q # Quit psql

Best Practices

Performance Optimization

  1. Use appropriate data types - Don't use TEXT when VARCHAR(50) suffices
  2. Create proper indexes - On frequently queried columns
  3. Use EXPLAIN ANALYZE - To understand query performance
  4. Regular maintenance - VACUUM and ANALYZE regularly
  5. Connection pooling - Use tools like PgBouncer for high-traffic applications

Security

  1. Use roles and permissions - Principle of least privilege
  2. Enable SSL - For all connections
  3. Regular updates - Keep PostgreSQL updated
  4. Row-level security - For multi-tenant applications
  5. Audit logging - Track database access and changes

Data Integrity

  1. Use constraints - Foreign keys, check constraints, not null
  2. Transactions - Wrap related operations in transactions
  3. Backup strategy - Regular backups with tested restore procedures
  4. Monitoring - Set up alerts for disk space, connections, performance

Development

  1. Use prepared statements - Prevent SQL injection
  2. Connection management - Properly close connections
  3. Error handling - Handle database errors gracefully
  4. Testing - Test database operations thoroughly
  5. Migration scripts - Version control database schema changes

PostgreSQL is a powerful and feature-rich database system that excels in complex applications requiring data integrity, advanced queries, and extensibility. While it may have a steeper learning curve than MySQL, its robustness and capabilities make it an excellent choice for serious applications.